Formal Analysis of GPU Programs with Atomics via Conflict-Directed Delay-Bounding

Authors

  • Wei-Fan Chiang
  • Ganesh Gopalakrishnan
  • Guodong Li
  • Zvonimir Rakamaric
Abstract

GPU-based computing has made significant strides in recent years. Unfortunately, GPU program optimizations can introduce subtle concurrency errors, so incisive formal bug-hunting methods are essential. This paper presents a new formal bug-hunting method for GPU programs that combine barriers and atomics. We present an algorithm called the conflict-directed delay-bounded scheduling algorithm (CD), which exploits the occurrence of conflicts among atomic synchronization commands to trigger the generation of alternate schedules; these alternate schedules are executed in a delay-bounded manner. We formally describe CD and present two correctness-checking methods, one based on final-state comparison and the other on user assertions. We evaluate our implementation on realistic GPU benchmarks, with encouraging results.
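
The following is a minimal sketch of the general idea summarized in the abstract, not the authors' CD algorithm or implementation. It assumes a toy model in which each thread is a list of atomic read-modify-write operations, the default scheduler runs threads to completion in order, and a "delay" moves the current thread to the back of the queue. A delay is only tried when the thread's next operation conflicts with (touches the same address as) a pending operation of another thread. All names (`explore`, `conflicts`, `max_delays`) are hypothetical.

```python
from collections import deque

def explore(threads, init, max_delays):
    """Return the set of final memory states reachable with at most
    max_delays delays. threads[i] is a list of (address, update_fn)
    pairs, each modelling one atomic read-modify-write (toy model)."""
    finals = set()

    def conflicts(tid, pcs):
        # True if another thread still has a pending op on the same address.
        addr = threads[tid][pcs[tid]][0]
        return any(
            op[0] == addr
            for other, ops in enumerate(threads) if other != tid
            for op in ops[pcs[other]:]
        )

    def run(queue, pcs, mem, delays):
        # Drop threads that have finished all their operations.
        queue = deque(t for t in queue if pcs[t] < len(threads[t]))
        if not queue:
            finals.add(tuple(sorted(mem.items())))
            return
        tid = queue[0]
        # Alternate schedule: spend one delay, but only at a conflict.
        if delays < max_delays and len(queue) > 1 and conflicts(tid, pcs):
            rotated = deque(queue)
            rotated.rotate(-1)  # move the current thread to the back
            run(rotated, pcs, mem, delays + 1)
        # Default schedule: execute the current thread's next atomic op.
        addr, fn = threads[tid][pcs[tid]]
        new_mem = dict(mem)
        new_mem[addr] = fn(mem[addr])
        new_pcs = list(pcs)
        new_pcs[tid] += 1
        run(queue, new_pcs, new_mem, delays)

    run(deque(range(len(threads))), [0] * len(threads), dict(init), 0)
    return finals

# Two threads whose atomic updates to "x" do not commute.
threads = [[("x", lambda v: v + 1)], [("x", lambda v: v * 2)]]
states = explore(threads, {"x": 1}, max_delays=1)
```

In this toy setting, more than one element in `states` means the final state depends on the schedule, which is the kind of discrepancy a final-state comparison would flag. With `max_delays=0` only the default schedule is explored.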

Similar resources

Warps and Atomics: Beyond Barrier Synchronization in the Verification of GPU Kernels

We describe the design and implementation of methods to support reasoning about data races in GPU kernels where constructs other than the standard barrier primitive are used for synchronization. At one extreme we consider kernels that exploit implicit, coarse-grained synchronization between threads in the same warp, a feature provided by many architectures. At the other extreme we consider kern...

GPU-accelerated Hausdorff distance computation between dynamic deformable NURBS surfaces

We present a parallel GPU-accelerated algorithm for computing the directed Hausdorff distance from one NURBS surface to another, within a bound. We make use of axis-aligned bounding-box hierarchies that bound the NURBS surfaces to accelerate the computations. We dynamically construct as well as traverse the bounding-box hierarchies for the NURBS surfaces using operations that are optimized for ...

Parallel Irradiance Caching on the GPU

While ray tracing is highly parallelizable in concept, the Radiance suite of programs for architectural global illumination simulation was written for serial execution and makes use of certain heuristic techniques that are not easily performed in parallel environments. It uses irradiance caching to store and reuse the results of expensive indirect irradiation computations. The irradiance cache ...

Implementation of the direction of arrival estimation algorithms by means of GPU-parallel processing in the CUDA environment (Research Article)

Direction-of-arrival (DOA) estimation of audio signals is critical in different areas, including electronic warfare, sonar, etc. Beamforming methods such as Minimum Variance Distortionless Response (MVDR) and Delay-and-Sum (DAS), and the subspace-based Multiple Signal Classification (MUSIC), are the best-known DOA estimation techniques. These methods have high computational complexity. Hence using...

Stability analysis and feedback control of T-S fuzzy hyperbolic delay model for a class of nonlinear systems with time-varying delay

In this paper, a new T-S fuzzy hyperbolic delay model for a class of nonlinear systems with time-varying delay is presented to address the problems of stability analysis and feedback control. A fuzzy controller is designed based on parallel distributed compensation (PDC), and with a new Lyapunov function, delay-dependent asymptotic stability conditions of the closed-loop system are derived v...

Publication date: 2013